Towards trustworthy phoneme boundary detection with autoregressive model and improved evaluation metric
Phoneme boundary detection has been studied due to its central role in
various speech applications. In this work, we point out that this task needs to
be addressed not only through better algorithms but also through better evaluation metrics. To
this end, we first propose a state-of-the-art phoneme boundary detector that
operates in an autoregressive manner, dubbed SuperSeg. Experiments on the TIMIT
and Buckeye corpora demonstrate that SuperSeg identifies phoneme boundaries
more accurately than existing models by a significant margin. Furthermore, we note that
the popular evaluation metric, R-value, has a limitation, and propose
new evaluation metrics that prevent each boundary from contributing to
evaluation multiple times. The proposed metrics reveal the weaknesses of
non-autoregressive baselines and establish a reliable criterion suited to
evaluating phoneme boundary detection.
Comment: 5 pages, submitted to ICASSP 202
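The idea of letting each boundary contribute to the score at most once can be illustrated with a one-to-one matching. The following is a minimal sketch, not the metrics proposed in the paper: the function names, the 20 ms tolerance, and the greedy two-pointer matching are all assumptions made for illustration.

```python
def match_boundaries(reference, predicted, tolerance=0.02):
    """Count one-to-one matches between boundary times given in seconds.

    Hypothetical helper: each reference and each predicted boundary is
    consumed by at most one match, so no boundary is counted twice.
    """
    ref = sorted(reference)
    pred = sorted(predicted)
    i = j = hits = 0
    while i < len(ref) and j < len(pred):
        if abs(ref[i] - pred[j]) <= tolerance:
            # Pair these two boundaries and consume both.
            hits += 1
            i += 1
            j += 1
        elif pred[j] < ref[i]:
            j += 1  # predicted boundary is too early to match anything left
        else:
            i += 1  # reference boundary has no remaining candidate
    return hits


def precision_recall_f1(reference, predicted, tolerance=0.02):
    """Precision, recall, and F1 computed from the one-to-one hit count."""
    hits = match_boundaries(reference, predicted, tolerance)
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(reference) if reference else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1
```

With reference boundaries at 0.10 s and 0.50 s and predictions at 0.09 s, 0.11 s, and 0.50 s, only one of the two predictions near 0.10 s is credited, so the over-segmented prediction is penalized in precision.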
Differentiable Artificial Reverberation
Artificial reverberation (AR) models play a central role in various audio
applications. Therefore, estimating the AR model parameters (ARPs) of a target
reverberation is a crucial task. Although a few recent deep-learning-based
approaches have shown promising performance, their non-end-to-end training
scheme prevents them from fully exploiting the potential of deep neural
networks. This motivates the introduction of differentiable artificial reverberation
(DAR) models, which allow loss gradients to be back-propagated end-to-end.
However, implementing the AR models with their difference equations "as is" in
a deep-learning framework severely bottlenecks the training speed when
executed on a parallel processor such as a GPU, due to their infinite impulse
response (IIR) components. We tackle this problem by replacing the IIR filters
with finite impulse response (FIR) approximations obtained with the frequency-sampling
method (FSM). Using the FSM, we implement three DAR models -- differentiable
Filtered Velvet Noise (FVN), Advanced Filtered Velvet Noise (AFVN), and
Feedback Delay Network (FDN). For each AR model, we train its ARP estimation
networks for analysis-synthesis (RIR-to-ARP) and blind estimation
(reverberant-speech-to-ARP) tasks in an end-to-end manner with its DAR model
counterpart. Experimental results show that the proposed method achieves
consistent performance improvement over the non-end-to-end approaches in both
objective metrics and subjective listening test results.
Comment: Manuscript submitted to TASL
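The frequency-sampling idea mentioned in the abstract can be sketched in a few lines. This is an illustrative NumPy version under stated assumptions, not the authors' implementation: the function name `fsm_fir` and the one-pole example filter are hypothetical, and a trainable version would use the equivalent FFT operations of a deep-learning framework.

```python
import numpy as np


def fsm_fir(b, a, n_taps):
    """Approximate the IIR filter B(z)/A(z) by an FIR filter of n_taps taps.

    The frequency response is sampled at n_taps uniformly spaced points on
    the unit circle and transformed back with an inverse FFT. No recursion
    over time samples is needed, so the computation parallelizes well.
    """
    # The zero-padded FFT evaluates each polynomial at w_k = 2*pi*k/n_taps.
    numerator = np.fft.rfft(b, n_taps)
    denominator = np.fft.rfft(a, n_taps)
    return np.fft.irfft(numerator / denominator, n_taps)


# Hypothetical example: one-pole filter H(z) = 1 / (1 - 0.5 z^-1),
# whose exact impulse response is 0.5**n.
h_fir = fsm_fir([1.0], [1.0, -0.5], 64)
h_exact = 0.5 ** np.arange(64)
```

The result is a time-aliased copy of the true impulse response, so the approximation is accurate when the response has decayed well within `n_taps` samples and degrades for slowly decaying filters.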
Generation of Non-uniform Meshes for Finite-Difference Time-Domain Simulations
In this paper, two automatic mesh generation algorithms are presented. The methods seek to optimize mesh density with regard to geometries exhibiting both fine and coarse physical structures. When generating meshes, the algorithms attempt to satisfy the conditions on the maximum mesh spacing and the maximum grading ratio simultaneously. Both algorithms successfully produce non-uniform meshes that satisfy the requirements for finite-difference time-domain simulations of microwave components. Additionally, an algorithm successfully generates a minimum number of grid points while maintaining the simulation accuracy.
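The two constraints named in the abstract, a maximum mesh spacing and a maximum grading ratio between neighboring cells, can be illustrated in one dimension. The sketch below is an assumption-laden toy, not either of the paper's algorithms: the function name, its parameters, and the strategy of geometric growth followed by uniform rescaling are all hypothetical.

```python
def graded_cells(length, h_start, h_max, ratio):
    """Return cell sizes that cover [0, length] with a graded 1-D mesh.

    Cells grow from h_start by at most `ratio` per step and are capped at
    h_max. All cells are then scaled down uniformly to fit `length`
    exactly, which preserves every neighbor ratio and the h_max cap.
    """
    if length <= 0 or not (0 < h_start <= h_max) or ratio < 1:
        raise ValueError("need length > 0, 0 < h_start <= h_max, ratio >= 1")
    cells = []
    h = h_start
    total = 0.0
    while total < length:
        cells.append(h)
        total += h
        h = min(h * ratio, h_max)  # grow, but never beyond the spacing cap
    scale = length / total  # scale <= 1, so the constraints still hold
    return [c * scale for c in cells]


# Hypothetical example: fine cells near a small feature at x = 0,
# coarsening toward the far end of a 10-unit span.
cells = graded_cells(10.0, 0.1, 1.0, 1.3)
```

The rescaling makes the first cell slightly smaller than `h_start`, which is acceptable when `h_start` is an upper bound on the spacing required near the fine structure.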
High-efficiency Bidirectional Buck-Boost Converter for Residential Energy Storage System
This paper proposes a bidirectional dc-dc converter for residential micro-grid applications. The proposed converter can operate over an input voltage range that overlaps the output voltage range. This converter uses two snubber capacitors to reduce the switch turn-off losses, a dc-blocking capacitor to reduce the input/output filter size, and a 1:1 transformer to reduce core loss. The windings of the transformer are connected in parallel and in reverse-coupled configuration to suppress magnetic flux swing in the core. Zero-voltage turn-on of the switch is achieved by operating the converter in discontinuous conduction mode. The experimental converter was designed to operate at a switching frequency of 40-210 kHz, an input voltage of 48 V, an output voltage of 36-60 V, and an output power of 50-500 W. The power conversion efficiency for boost conversion to 60 V was >= 98.3% in the entire power range. The efficiency for buck conversion to 36 V was >= 98.4% in the entire power range. The output voltage ripple at full load was <3.59 V-p.p for boost conversion (60 V) and 1.35 V-p.p for buck conversion (36 V) with the reduced input/output filter. The experimental results indicate that the proposed converter is well-suited to smart-grid energy storage systems that require high efficiency, small size, and overlapping input and output voltage ranges.